ronpal (Collaborator) commented Dec 2, 2025

Description

The goal of this change is to improve the performance and overall user experience of the build command. It should also result in more maintainable code.

Flow

The general flow aims to collect and validate in stages, eventually running more of the work in parallel, and finally running recommendations and fixes on the built state. It might be better to run optimizations on individual modules, so this may change.
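As a rough illustration of the staged flow described above, here is a minimal sketch. All names (`collect`, `validate`, `build_module`, `recommend`, `run_build`, `BuildResult`) are hypothetical and not taken from this PR, and the per-module work is a placeholder. It only shows the ordering: collect, validate, build modules in parallel, then run recommendations on the built state.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class BuildResult:
    """Hypothetical container for the built state and the issues found."""

    built: dict[str, str] = field(default_factory=dict)
    issues: list[str] = field(default_factory=list)


def collect(modules: dict[str, str]) -> dict[str, str]:
    # Stage 1: gather the raw module sources (here: an in-memory dict).
    return dict(modules)


def validate(modules: dict[str, str]) -> list[str]:
    # Stage 2: cheap checks before any expensive work starts.
    return [f"{name}: empty module" for name, src in modules.items() if not src.strip()]


def build_module(item: tuple[str, str]) -> tuple[str, str]:
    # Stage 3: per-module work that is independent, so it can run in parallel.
    name, source = item
    return name, source.upper()  # placeholder for the real build step


def recommend(built: dict[str, str]) -> list[str]:
    # Stage 4: recommendations run on the complete built state.
    return [f"{name}: consider splitting" for name, out in built.items() if len(out) > 20]


def run_build(modules: dict[str, str]) -> BuildResult:
    collected = collect(modules)
    issues = validate(collected)
    valid = {name: src for name, src in collected.items() if src.strip()}
    with ThreadPoolExecutor() as pool:
        built = dict(pool.map(build_module, valid.items()))
    issues.extend(recommend(built))
    return BuildResult(built=built, issues=issues)
```

If optimizations end up running per module instead, the `recommend` step would move inside `build_module`; the staging around it would stay the same.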

Structure

We'll need a new set of data classes/structures for recommendations. A recommendation will have a unique code so that we can link to docs, but also so that users can dictate in their cdf.toml which rules and recommendations they want shown (yes, like ruff and mypy).
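A minimal sketch of what such a structure could look like, assuming a plain dataclass and ruff-style `select`/`ignore` lists with prefix matching. The class name, the field names and the `select`/`ignore` keys are assumptions for illustration; nothing here reflects an existing cdf.toml schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    code: str  # unique, e.g. "NEAT001"; can double as a docs anchor
    message: str
    fixable: bool = False


def is_enabled(code: str, select: list[str], ignore: list[str]) -> bool:
    """Prefix matching (an assumption): 'NEAT' selects every NEAT* code."""
    selected = any(code.startswith(prefix) for prefix in select)
    ignored = any(code.startswith(prefix) for prefix in ignore)
    return selected and not ignored


def filter_recommendations(
    recommendations: list[Recommendation], select: list[str], ignore: list[str]
) -> list[Recommendation]:
    # Keep only the recommendations the user asked to see.
    return [r for r in recommendations if is_enabled(r.code, select, ignore)]
```

With `select=["NEAT"]` and `ignore=["NEAT003"]`, a list containing NEAT001, NEAT002 and NEAT003 would be reduced to the first two.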

A recommendation, if fixable,

Collecting and logging warnings

Warnings should be collected and grouped by category so that we can print (or write to file) more readable logs:

aggregated:
   NEAT001 (found 10, fixed 10)
   NEAT002 (found 4, fixed 2)
   NEAT003 (found 3)
detailed:
   NEAT001
      path/to/module/or/file [NOT FIXED]
      path/to/module/or/file [FIXED]

Exit on warning still applies, but not before warnings have been printed or logged. Exceptions are raised as usual.
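The grouping and the "print first, then exit" ordering could be sketched roughly as follows. `Issue`, `render_report` and `finish` are hypothetical names, and treating only unfixed issues as a reason to exit is an assumption, not something the description specifies.

```python
from collections import defaultdict
from dataclasses import dataclass

INDENT = "   "


@dataclass
class Issue:
    code: str
    path: str
    fixed: bool = False


def render_report(issues: list[Issue]) -> str:
    # Group issues by their code so each code is reported once.
    by_code: dict[str, list[Issue]] = defaultdict(list)
    for issue in issues:
        by_code[issue.code].append(issue)

    lines = ["aggregated:"]
    for code in sorted(by_code):
        group = by_code[code]
        fixed = sum(issue.fixed for issue in group)
        counts = f"found {len(group)}" + (f", fixed {fixed}" if fixed else "")
        lines.append(f"{INDENT}{code} ({counts})")

    lines.append("detailed:")
    for code in sorted(by_code):
        lines.append(f"{INDENT}{code}")
        for issue in by_code[code]:
            status = "FIXED" if issue.fixed else "NOT FIXED"
            lines.append(f"{INDENT * 2}{issue.path} [{status}]")
    return "\n".join(lines)


def finish(issues: list[Issue], exit_on_warning: bool) -> int:
    # Always print the report before deciding on the exit code.
    print(render_report(issues))
    unresolved = any(not issue.fixed for issue in issues)  # assumption
    return 1 if exit_on_warning and unresolved else 0
```

Returning the exit code instead of calling `sys.exit` inside `finish` keeps the function easy to unit test; the CLI layer can pass the result on.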

Testing

TBD. We'll need to do full regression testing here, in addition to adding new unit tests.

Release

This change will be behind an alpha flag until we decide to release it (planned for v0.8).

Bump

  • Patch
  • Skip

Changelog

Added

  • [alpha] build v2

Collaborator Author

Not part of the task, strictly speaking. Just a proposal from the AI to improve performance by compiling regexes. I will split it into a separate task later; please ignore for now.

Collaborator

Do you mean it will not be part of this PR?

If you want to do that performance boost, can you check that it is actually a boost? Introducing threading and locks increases the complexity of the code.
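One way to check whether caching compiled patterns is actually a boost is a small `timeit` comparison like the sketch below. The pattern and text are made up for illustration and are not the ones used in this PR. Two facts worth keeping in mind: the `re` module already keeps an internal cache of recently compiled patterns, so the gain may be small, and `functools.lru_cache` is documented as thread-safe, so this variant needs no explicit lock.

```python
import re
import timeit
from functools import lru_cache

PATTERN = r"\{\{\s*my_var\s*\}\}"  # hypothetical template-variable pattern
TEXT = "name: {{ my_var }}\n" * 200


def replace_uncached(text: str) -> str:
    # Relies on the re module's own internal pattern cache.
    return re.sub(PATTERN, "value", text)


@lru_cache(maxsize=None)
def _compiled(pattern: str) -> re.Pattern[str]:
    # Compile each distinct pattern once and reuse it afterwards.
    return re.compile(pattern)


def replace_cached(text: str) -> str:
    return _compiled(PATTERN).sub("value", text)


if __name__ == "__main__":
    for fn in (replace_uncached, replace_cached):
        seconds = timeit.timeit(lambda: fn(TEXT), number=2_000)
        print(f"{fn.__name__}: {seconds:.3f}s")
```

Running it on representative module files rather than a synthetic string would give a more trustworthy answer for the build command.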

doctrino (Collaborator) left a comment

I have done a first pass and thrown in some comments as I went through. I think I understand the code, but I do not completely see where it is heading. But more than good enough to start a discussion :)

This is a good refactoring that can be moved out to a separate PR.

github-actions bot commented Dec 3, 2025

☂️ Python Coverage

current status: ✅

Overall Coverage

| Lines | Covered | Coverage | Threshold | Status |
| --- | --- | --- | --- | --- |
| 27971 | 23683 | 85% | 80% | 🟢 |

New Files

| File | Coverage | Status |
| --- | --- | --- |
| cognite_toolkit/_cdf_tk/commands/build_v2/__init__.py | 100% | 🟢 |
| cognite_toolkit/_cdf_tk/commands/build_v2/build_cmd.py | 85% | 🟢 |
| cognite_toolkit/_cdf_tk/commands/build_v2/build_input.py | 96% | 🟢 |
| cognite_toolkit/_cdf_tk/commands/build_v2/build_issues.py | 91% | 🟢 |
| TOTAL | 93% | 🟢 |

Modified Files

| File | Coverage | Status |
| --- | --- | --- |
| cognite_toolkit/_cdf_tk/commands/__init__.py | 100% | 🟢 |
| cognite_toolkit/_cdf_tk/commands/build_cmd.py | 80% | 🟢 |
| cognite_toolkit/_cdf_tk/data_classes/_build_variables.py | 93% | 🟢 |
| cognite_toolkit/_cdf_tk/validation.py | 96% | 🟢 |
| TOTAL | 92% | 🟢 |

updated for commit: ad34daa by action🐍

codecov bot commented Dec 3, 2025

Codecov Report

❌ Patch coverage is 89.34426% with 26 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.66%. Comparing base (ca5e8c0) to head (ad34daa).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...ite_toolkit/_cdf_tk/commands/build_v2/build_cmd.py | 85.00% | 15 Missing ⚠️ |
| cognite_toolkit/_cdf_tk/validation.py | 84.00% | 4 Missing ⚠️ |
| cognite_toolkit/_cdf_tk/commands/build_cmd.py | 40.00% | 3 Missing ⚠️ |
| ...e_toolkit/_cdf_tk/commands/build_v2/build_input.py | 95.55% | 2 Missing ⚠️ |
| ..._toolkit/_cdf_tk/commands/build_v2/build_issues.py | 90.90% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2273      +/-   ##
==========================================
+ Coverage   84.63%   84.66%   +0.03%     
==========================================
  Files         281      284       +3     
  Lines       27762    27971     +209     
==========================================
+ Hits        23496    23683     +187     
- Misses       4266     4288      +22     
| Files with missing lines | Coverage Δ |
| --- | --- |
| cognite_toolkit/_cdf_tk/commands/__init__.py | 100.00% <ø> (ø) |
| ...e_toolkit/_cdf_tk/data_classes/_build_variables.py | 92.74% <100.00%> (+1.38%) ⬆️ |
| ...e_toolkit/_cdf_tk/commands/build_v2/build_input.py | 95.55% <95.55%> (ø) |
| ..._toolkit/_cdf_tk/commands/build_v2/build_issues.py | 90.90% <90.90%> (ø) |
| cognite_toolkit/_cdf_tk/commands/build_cmd.py | 79.87% <40.00%> (-0.81%) ⬇️ |
| cognite_toolkit/_cdf_tk/validation.py | 95.79% <84.00%> (-3.15%) ⬇️ |
| ...ite_toolkit/_cdf_tk/commands/build_v2/build_cmd.py | 85.00% <85.00%> (ø) |

... and 5 files with indirect coverage changes

ronpal (Collaborator Author) commented Dec 3, 2025

/gemini review

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces a significant refactoring of the build command, creating a new v2 implementation behind a feature flag. The changes are well-structured, moving validation logic into a dedicated function and introducing Pydantic models for build inputs and issues, which aligns with the repository's style guide. The performance of variable replacement is also improved by caching compiled regular expressions.

My main concern is that the new v2 build command appears to have lost support for packages defined in cdf.toml, which is a functional regression. I've left a comment with a suggestion on how to restore this functionality.

ronpal and others added 2 commits December 3, 2025 16:43
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@ronpal ronpal marked this pull request as ready for review December 3, 2025 15:53
@ronpal ronpal requested review from a team as code owners December 3, 2025 15:53
doctrino (Collaborator) left a comment

Overall good structure, a few clarifying questions and suggestions.


cmd = BuildCommand(print_warning=print_warning)
cmd = (
BuildCommandV2(print_warning=print_warning)

Will you be able to reuse the old CLI interface? I would expect that to change, but I might be mistaken.

description: str


class BuildIssueList(RootModel[list[BuildIssue]]):

I think if you subclass UserList, you do not have to repeat all the standard list methods (append, extend, len).
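A minimal sketch of that suggestion, using a plain dataclass as a stand-in for the real `BuildIssue` model (the stand-in and the `descriptions` helper are assumptions for illustration). `collections.UserList` supplies `append`, `extend`, `len()`, iteration and slicing, so only domain-specific helpers need to be written.

```python
from collections import UserList
from dataclasses import dataclass


@dataclass
class BuildIssue:
    """Stand-in for the real model; only the field visible in the diff."""

    description: str


class BuildIssueList(UserList[BuildIssue]):
    """List behaviour is inherited; only extra helpers are defined here."""

    def descriptions(self) -> list[str]:
        # self.data is the underlying plain list managed by UserList.
        return [issue.description for issue in self.data]
```

The trade-off is that a `UserList` subclass is no longer a Pydantic model, so any validation or serialization the `RootModel` version provides would have to be handled separately.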

warnings.append(environment_warning)
return config, warnings

@property

Good case for a cached property?
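For reference, a small sketch of how `functools.cached_property` behaves, on a hypothetical stand-in class (the class name, attribute and returned paths are made up, not taken from this PR). The decorated method runs once per instance and the result is stored on the instance afterwards.

```python
from functools import cached_property


class BuildInputSketch:
    """Hypothetical stand-in for the class under review."""

    def __init__(self, organization_dir: str) -> None:
        self.organization_dir = organization_dir
        self.load_count = 0  # only here to show how often loading happens

    @cached_property
    def modules(self) -> list[str]:
        # Runs on first access only; later accesses reuse the stored value.
        self.load_count += 1
        return [f"{self.organization_dir}/modules/a", f"{self.organization_dir}/modules/b"]
```

Note that `cached_property` stores the value in the instance `__dict__`, so it needs a writable `__dict__`, and the cached value will not refresh if the inputs change after the first access.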

user_selected_modules = self.config.environment.get_selected_modules({})
return ModuleDirectories.load(self.organization_dir, user_selected_modules)

@property

Good case for a cached property?

This file should not be committed?

Same here

return f"{prefix}{suffix}"


def validate_module_selection(

This is a nice refactoring. As this has a lot of logic, it should have unit tests. Can you add a task for it?

3 participants